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Abstract —Pansharpening aims at fusing a panchromatic image 
with a multispectral one, to generate an image with the high 
spatial resolution of the former and the high spectral resolution 
of the latter. In the last decade, many algorithms have been 
presented in the literature for pansharpening using multispectral 
data. With the increasing availability of hyperspectral systems, 
these methods are now being adapted to hyperspectral images. In 
this work, we compare new pansharpening techniques designed 
for hyperspectral data with some of the state of the art methods 
for multispectral pansharpening, which have been adapted for 
hyperspectral data. Eleven methods from different classes (com¬ 
ponent substitution, multiresolution analysis, hybrid, Bayesian 
and matrix factorization) are analyzed. These methods are ap¬ 
plied to three datasets and their effectiveness and robustness are 
evaluated with widely used performance indicators. In addition, 
all the pansharpening techniques considered in this paper have 
been implemented in a MATLAB toolbox that is made available 
to the community. 

Index Terms —Pansharpening, hyperspectral image, image fu¬ 
sion 


1. Introduction 

I N the design of optical remote sensing systems, owing to 
the limited amount of incident energy, there are critical 
tradeoffs between the spatial resolution, the spectral resolution, 
and signal-to-noise ratio (SNR). For this reason, optical sys¬ 
tems can provide data with a high spatial resolution but with 
a small number of spectral bands (for example, panchromatic 
data with decimetric spatial resolution or multispectral data 
with three to four bands and metric spatial resolution, like 
PLEIADES Q) or with a high spectral resolution but with 
reduced spatial resolution (for example, hyperspectral data, 
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subsequently referred to as HS data, with more than one hun¬ 
dred of bands and decametric spatial resolution like HYPXIM 
Q). To enhance the spatial resolution of multispectral data, 
several methods have been proposed in the literature under the 
name of pansharpening, which is a form of superresolution. 
Eundamentally, these methods solve an inverse problem which 
consists of obtaining an enhanced image with both high 
spatial and high spectral resolutions from a panchromatic 
image and a multispectral image. The huge interest of the 
community on this topic is evidenced by the existence of 
sessions dedicated to this topic in the most important remote 
sensing and Earth observation conferences as well as by the 
launch of public contests, of which the one sponsored by the 
data fusion committee of the IEEE Geoscience and Remote 
Sensing society is an example. 

A taxonomy of pansharpening methods can be found in the 
literature 0.0.0- They can be broadly divided into four 
classes: component substitution (CS), multiresolution analysis 
(MRA), Bayesian, and variational. The CS approach relies on 
the substitution of a component (obtained, e.g., by a spectral 
transformation of the data) of the multispectral (subsequently 
denoted as MS) image by the panchromatic (subsequently 
denoted as PAN) image. The CS class contains algorithms 
such as intensity-hue-saturation (IHS) [0, 0 0 principal 
component analysis (PC A) fT0| , fTT| , fl^ and Gram-Schmidt 
(GS) spectral sharpening UlSj. The MRA approach is based on 
the injection of spatial details, which are obtained through a 
multiscale decomposition of the PAN image into the MS data. 
The spatial details can be extracted according to several modal¬ 
ities of MRA: decimated wavelet transform (DWT) (H, un¬ 
decimated wavelet transform (UDWT) pS] , ”a-trous” wavelet 
transform (ATWT) fT^ , Laplacian pyramid (lz)> nonseparable 
transforms, either based on wavelets (e.g., contourlets p8| ) or 
not (e.g., curvelets G2l)- Hybrid methods have been also pro¬ 
posed, which use both component substitution and multiscale 
decomposition, such as guided filter PCA (GEPCA), described 
in Section [TLC) The Bayesian approach relies on the used of 
posterior distribution of the full resolution target image given 
the observed MS and PAN images. This posterior, which is the 
Bayesian inference engine, has two factors: a) the likelihood 
function, which is the probability density of the observed MS 
and PAN images given the target image, and b) the prior 
probability density of the target image, which promotes target 
images with desired properties, such as being segmentally 
smooth. The selection of a suitable prior allows us to cope 
with the usual ill-posedness of the pansharpening inverse 
problems. The variational class is interpretable as particular 
case of the Bayesian one, where the target image is estimated 
by maximizing the posterior probability density of the full 
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resolution image. The works pQ| , (D are representative 
of the Bayesian and variational classes. As indicated in Table 
1^ the CS, MRA, and Hybrid classes of methods are detailed 
in Sections II-A II-B[ and |II-C| respectively. Herein, the 
Bayesian class is not addressed in the MS+PAN context. It 
is addressed in detail, however, in Section |II-D| in the context 
of HS+PAN fusion. 

With the increasing availability of HS systems, the pan- 
sharpening methods are now extending to the fusion of HS 
and panchromatic images | [23| , p4| , p5| , | [26| . Pansharp- 
ening of HS images is still an open issue, and very few 
methods are presented in the literature to address it. The 
main advantage of HS image with respect to MS one is the 
more accurate spectral information they provide, which clearly 
benefits many applications such as unmixing p7| , change 
detection | [28| , object recognition p9| , scene interpretation 
p0| and classification ID- Several of the methods designed 
for HS pansharpening were originally designed for the fusion 
of MS and HS datap2|-p6|, the MS data constituting the high 
spatial resolution image. In this case, HS pansharpening can be 
seen as a particular case, where the MS image is composed of 
a single band, and thus reduces to a PAN image. In this paper, 
we divide these methods into two classes: Bayesian methods 
and matrix factorization based methods. In Section |II-D| we 
briefly present the algorithms of p3] , and p5| of the 
former class and in Section |II-E| the algorithm of |32| of the 
latter class. 

As one may expect, performing pansharpening with HS 
data is more complex than performing it with MS data. 
Whereas PAN and MS data are usually acquired almost in 
the same spectral range, the spectral range of an HS image 
normally is much wider than the one of the corresponding 
PAN image. Usually, the PAN spectral range is close to 
the visible spectral range of 0.4 — 0.8/im (for example, the 
advanced land imager-ALI-instrument acquires PAN data in 
the range 0.48 — 0.69/im). The HS range often covers the 
visible to the shortwave infrared (SWIR) range (for example, 
Hyperion acquires HS data in the range 0.4 —2.5/im, the range 
0.8 —2.5/im being not covered by the PAN data). The difficulty 
that arises, consists in defining a fusion model that yields good 
results in the part of the HS spectral range that is not covered 
by PAN data,in which the high resolution spatial information 
is missing. This difficulty already existed, to some extent, in 
MS-i-PAN pansharpening, but it is much more severe in the 
HS-fPAN case. 

To the best of the authors’ knowledge, there is currently 
no study comparing different fusion methods for HS data, 
particularly on datasets where the spectral domain of the HS 
image is larger than the one of the PAN image. This work 
aims at addressing this specific issue. The remainder of the 
paper is organized as follows. Section II reviews the methods 
under study, i.e., CS, MRA, hybrid, Bayesian, and matrix 
decomposition approaches. Section III summarizes the quality 
assessment measures that will be used to assess the image 
fusion results. Experimental results are presented in Section 
IV. Conclusions are drawn in Section V. 


TABLE I 

Summary of the different classes of methods considered in 
THIS PAPER. Within parentheses, we indicate the acronym of 

EACH METHOD, FOLLOWED BY THE NUMBER OF THE SECTION IN WHICH 
THAT METHOD IS DESCRIBED. 


Methods originally designed eor MS Pansharpening 

Component substitution (CS, 

IH3 

Trmcipal Component Analysis 
(PCA, II-Al} 

Multiresolution analysis (MRA, 
0 

smoothing filter-based intensity 
modulation (SEIM, II-BH 

Laplacian pyramid ll-BZ] 

Gram kcnmidt (GS, II-A2| 

Hybrid methods jll-Cj 

Guided Filter PCA (GEPCA) 

Bayesian methods 

Not discussed in this paper 


Methods originally designed eor HS Pansharpening 

Bayesian Methods jll-D 

Naive Gaussian prior Jll-bl ^ 

Sparsity promoting prior Jll-UZj 
HySure jlI-D3| 

Matrix Factorization jlI-E| 

Coupled Non-negative Matrix Fac¬ 
torization (CNMF) 


H. Hyperspectral Pansharpening Techniques 

This section presents some of the most relevant methods 
for HS pansharpening. First, we focus on the adaptation of 
the popular CS and MRA MS pansharpening methods for HS 
pansharpening. Later, we consider more recent methods based 
on Bayesian and matrix factorization approaches. A toolbox 
containing MATLAB implementations of these algorithms can 
be found onlin^B 

Before presenting the different methods, we introduce no¬ 
tation used along the paper. Bold-face capital letters refer to 
matrices and bold-face lower-case letters refer to vectors. The 
notation refers to the kth row of X. The operator ()^ 
denotes the transposition operation. Images are represented by 
matrices, in which each row corresponds to a spectral band, 
containing all the pixels of that band arranged in lexicographic 
order. We use the following specific matrices: 

• X = [cci,..., Xn] G xn j'epresents the full resolution 
target image with mx bands and n pixels; X represents 
an estimate of that image. 

. Yh G Ym G and P G 

represents, respectively, the observed HS, MS, and PAN 
images, nx denoting the number of bands of the MS 
image and m the total number of pixel in the Yh image. 

• Yh G M’^Axn j'epresents the HS image Yh interpolated 
at the scale of the PAN image. 

We denote by d = x/mfn the down-sampling factor, 
assumed to be the same in both spatial dimensions. 

A. Component Substitution 

CS approaches rely upon the projection of the higher 
spectral resolution image into another space, in order to 
separate spatial and spectral information j^. Subsequently, the 
transformed data are sharpened by substituting the component 
that contains the spatial information with the PAN image (or 
part of it). The greater the correlation between the PAN image 

^ http://openremotesensing.net 
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and the replaced component, the less spectral distortion will 
be introduced by the fusion approach As a consequence, 
a histogram-matching procedure is often performed before 
replacing the PAN image. Finally, the CS-based fusion process 
is completed by applying the inverse spectral transformation 
to obtain the fused image. 

The main advantages of the CS-based fusion techniques are 
the following: i) high fidelity in rendering the spatial details 
in the final image | [J7} , ii) fast and easy implementation [sj, 
and in) robustness to misregistration errors and aliasing 
On the negative side, the main shortcoming of this class of 
techniques is the generation of a significant spectral distortion, 
cause by the spectral mismatch between the PAN and the HS 
spectral ranges 0 - 

Following 0, p9| , a formulation of the CS fusion scheme 
is given by 


X'==Y^+5fe(P-Oi), (1) 

for k = 1,... ,mA, where denotes the kth band of the 
estimated full resolution target image, g = [^i,.. 
a vector containing the injection gains, and Ol is defined as 

mx ^ 

Ol = ( 2 ) 

i=l 

where the weights v^ = [ici,..., ..., measure 

the spectral overlap among the spectral bands and the PAN 
image 0, | |40| . 

The CS family includes many popular pansharpening ap¬ 
proaches. In p6| , three approaches based on principal compo¬ 
nent analysis (PCA) 0 and Gram-Schmidt (H), p7| transfor¬ 
mations have been compared for sharpening HS data. A brief 
description of these techniques follows. 

1) Principal Component Analysis: PCA is a spectral 
transformation widely employed for pansharpening applica¬ 
tions 0 . It is achieved through a rotation of the original data 
(i.e., a linear transformation) that yields the so-called principal 
components (PCs). The hypothesis underlying its application 
to pansharpening is that the spatial information (shared by all 
the channels) is concentrated in the first PC, while the spectral 
information (specific to each single band) is accounted for 
the other PCs. The whole fusion process can be described by 
the general formulation stated by Eqs. Q and 0, where the 
vectors w and g of coefficient vectors are derived by the PCA 
procedure applied to the HS image. 

2) Gram-Schmidt: The Gram-Schmidt transformation, of¬ 
ten exploited in pansharpening approaches, was initially pro¬ 
posed in a patent by Kodak (m The fusion process starts by 
using, as the component, a synthetic low resolution PAN image 
II at the same spatial resolution as the HS imag^ A complete 
orthogonal decomposition is then performed, starting with that 
component. The pansharpening procedure is completed by 
substituting that component with the PAN image, and inverting 
the decomposition. This process is expressed by Q using the 

^GS is a more general method than PCA. PCA can be obtained, in GS, by 
using the first PC as the low resolution panchromatic image pTj . 


gains p7| 


_ cov(Y^,Ol) 
var(OL) 


( 3 ) 


for k = where cov(-,-) and var(-) denote the 

covariance and variance operations. Different algorithms are 
obtained by changing the definition of the weights in 0- 
The simplest way to obtain this low-resolution PAN image 
simply consists of averaging the HS bands (i.e., by setting 
Wi = Ifmx, for i = 1,..., mx). In 1^ , the authors proposed 
an enhanced version, called GS Adaptive (GSA), in which 1l 
is generated by the linear model in Q with weights estimated 
by the minimization of the mean square error between the 
estimated component and a filtered and downsampled version 
of the PAN image. 


B. Multiresolution Analysis 

Pansharpening methods based on MRA apply a spatial 
filter to the PAN image for generating details to be injected 
into the HS data. The main advantages of the MRA-based 
fusion techniques are the following: i) temporal coherence 0 
(see Sect.27.4.4), ii) spectral consistency, and Hi) robustness 
to aliasing, under proper conditions | [38| . On the negative 
side, the main shortcomings are i) the implementation is 
more complicated due to the design of spatial filters, ii) the 
computational burden is usually larger when compared to CS 
approaches. The fusion step is summarized as 0 (ig 

X'' = Y^ + Gfc ® (P - Pi) , (4) 

for k = l,...,mA, where Pl denotes a low-pass version 
of P, and the symbol (g) denotes element-wise multiplication. 
Furthermore, an equalization between the PAN image and the 
HS spectral bands is often required. P — P^ is often called 
the details image, because it is a high-pass version of P, and 
Eq. 0 can be seen as describing the way to inject details 
into each of the bands of the HS image. According to (0, the 
approaches belonging to this category can differ in i) the type 
of PAN low pass image Pl that is used, and ii) the definition 
of the gain coefficients G/.. Two common options for defining 
the gains are: 

1) G/c = lfor/c = l,..., mx, where 1 is an appropriately 
sized matrix with all elements equal to 1. This choice 
identifies the so-called additive injection scheme; 

2) Gk = Y^0PI, for /c = 1,..., mx, where the symbol 0 
denotes element-wise division. In this case, the details 
are weighted by the ratio between the upsampled HS 
image and the low-pass filtered PAN one, in order 
to reproduce the local intensity contrast of the PAN 
image in the fused image . This coefficient selection 
is often referred to as high pass modulation (HPM) 
method or multiplicative injection scheme. Some pos¬ 
sible numerical issues could appear due to the division 
between and P^ for low value of P^ creating 
fused pixel with very high value. In our toolbox this 
problem is addressed by clipping these values by using 
the information given by the dynamic range. 

In the case of HS pansharpening, some further consider¬ 
ations should be taken into account. Indeed, the PAN and 
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HS images are rarely acquired with the same platform. Thus, 
the ratio between the spatial resolutions of the PAN and HS 
images may not always be an integer number, or a power of 
two. This implies that some of the conventional approaches 
initially developed for MS images cannot be extended in a 
simple way to HS images (for example, dyadic wavelet-based 
algorithms cannot be applied in these conditions). 

1) Smoothing Filter-based Intensity Modulation (SFIM): 
The direct implementation of Eq. 0 consists of applying a 
single linear time-invariant (LTI) low pass filter (LPF) Hlp to 
the PAN image P for obtaining P/,. Therefore, we have 

=Yli+gk(P-P*hLp) (5) 

for /c = 1,..., mx, where the symbol * denotes the convolu¬ 
tion operator. The SFIM algorithm | [43| sets hpp to a simple 
box (i.e., an averaging) filter and exploits HPM as the details 
injection scheme. 

2) Laplacian Pyramid: The low-pass filtering needed to 
obtain the signal Pl at the original HS scale can be performed 
in more than one step. This is commonly referred to as 
pyramidal decomposition and dates back to the seminal work 
of Burt and Adelson GD If a Gaussian filter is used to low- 
pass filter the images in each step, one obtains a so-called 
Gaussian pyramid. The differences between consecutive levels 
of a Gaussian pyramid define the so-called Laplacian pyramid. 
The suitability of the latter to the pansharpening problem 
has been shown in @4). Indeed, Gaussian filters can be 
tuned to closely match the sensor modulation transfer function 
(MTF). In this case, the unique parameter that characterizes 
the whole distribution is the Gaussian’s standard deviation, 
which is determined from sensor-based information (usually 
from the value of the amplitude response at the Nyquist 
frequency, provided by the manufacturer). Both additive and 
multiplicative details injection schemes have been used in this 
framework | [42| , | [45| . They will be referred to as MTF - 
Generalized Laplacian Pyramid (MTF-GFP) | [45| and MTF- 
GLP with High Pass Modulation (MTF-GFP-HPM) | [42| , 
respectively. 

C. Hybrid Methods 

Hybrid approaches use concepts from different classes of 
methods, namely from CS and MRA ones, as explained next. 

1) Guided Filter PCA (GFPCA): One of the main chal¬ 
lenges for fusing low-resolution HS and high-resolution 
PAN/RGB data is to find an appropriate balance between spec¬ 
tral and spatial preservation. Recently, the guided filter | [46| has 
been used in many applications (e.g. edge-aware smoothing 
and detail enhancement), because of its efficiency and strong 
ability to transfer the structures of the guidance image to the 
filtering output. Its application to HS data can be found in ED’ 
where the guided filter was applied to transfer the structures 
of the principal components of the HS image to the initial 
classification maps. 

Here, we briefiy describe an image fusion framework which 
uses a guided filter in the PCA domain (GFPCA) | [48| . 
The approach won the “Best Paper Challenge” award at 
the 2014 IEEE data fusion contest | [48| , by fusing a low 


spatial resolution thermal infrared HS image and a high spatial 
resolution, visible RGB image associated with the same scene. 
Fig. shows the framework of GFPCA. Instead of using 
CS, which may cause spectral distortions, GFPCA uses a 
high resolution PAN/RGB image to guide the filtering process 
aimed at obtaining superresolution. In this way, GFPCA does 
not only preserve the spectral information from the original 
HS image, but also transfers the spatial structures of the high 
resolution PAN/RGB image to the enhanced HS image. To 
speed up the processing, GFPCA first uses PCA to decorrelate 
the bands of the HS image, and to separate the information 
content from the noise. The first p ^ mx PCA channels 
contain most of the energy (and most of the information) of 
an HS image, and the remaining mx —p PCA channels mainly 
contain noise (recall that mx is the number of spectral bands of 
the HS image). When applied to these noisy (and numerous) 
mx — p channels, the guided filter amplifies the noise and 
causes a high computational cost in processing the data, which 
is undesirable. Therefore, guided filtering is used to enlarge 
only the first k PCA channels, preserving the structures of 
the PAN/RGB image, while cubic interpolation is used to 
upsample the remaining channels. 

Fet PC^, with (i < p), denote the ith PC channel obtained 
from the HS image Yh, with its resolution increased to that 
of the guided image Y (Y may be a PAN or an RGB image) 
through bicubic interpolation. The output of the filtering, PC-, 
can be represented as an affine transformation of Y in a local 
window ujj of size {2d + 1) x {2d + 1) as follows: 

pc' = cl jY + hj^ Vi G ujj. (6) 

The above model ensures that the output PC- has an edge 
only if the guided image Y has an edge, since V(PC-) = 
aVY. The following cost function is used to determine the 
coefficients aj and bj\ 

E{aj,bj) = [(«iY + bj - PCi)2 + ea^] , (7) 

i^LUj 

where e is a regularization parameter determining the degree 
of blurring for the guided filter. For more details about the 
guided filtering scheme, we invite the reader to consult [ [46| . 
The cost function E leads the term OjY + hj to be as close 
as possible to PC^, in order to ensure the preservation of the 
original spectral information. Before applying inverse PCA, 
GFPCA also removes the noise from the remaining PCA 
channels PC Rest using a soft-thresholding scheme (similarly 
to iE3)> and increases their spatial resolution to the resolution 



Fig. 1. Fusion of HS and PAN/RGB images with the GFPCA framework. 
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of the PAN/RGB image using cubic interpolation only (without 
guided filtering). 

D. Bayesian Approaches 

The fusion of HS and high spatial resolution images, e.g., 
MS or PAN images, can be conveniently formulated within 
the Bayesian inference framework. This formulation allows an 
intuitive interpretation of the fusion process via the posterior 
distribution of the Bayesian fusion model. Since the fusion 
problem is usually ill-posed, the Bayesian methodology offers 
a convenient way to regularize the problem by defining an 
appropriate prior distribution for the scene of interest. Fol¬ 
lowing this strategy, different Bayesian estimators for fusing 
co-registered high spatial-resolution MS and high spectral- 
resolution HS images have been designed p^-p6|, m- 
(541 . The observation models associated with the HS and MS 
images can be written as follows (501 , | [55| , (561 


Yh=XBS + Nh ... 

Ym = RX + Nm 

where X, Yh, and Ym were defined in Section |n| and 

. B G is a cyclic convolution operator, correspond¬ 

ing to the spectral response of the HS sensor expressed 
in the resolution of the MS or PAN image, 

• S G is a down-sampling matrix with down- 

sampling factor d, 

. Rg ^nxxmx jg spectral response of the MS or PAN 
sensor, 

• Nh and Nm are the HS and MS noises, assumed to 
have zero mean Gaussian distributions with covariance 
matrices Ah and Am, respectively. 

For the sake of generality, the formulation in this section as¬ 
sumes that the observed data is the pair of matrices (Yh, Ym). 
Since a PAN image can be represented by Ym with nx = 1, 
the observation model ^ covers the HS-fPAN fusion problem 
considered in this paper. 

Using geometrical considerations well grounded in the HS 
imaging literature devoted to the linear unmixing problem 
(Z7l , the high spatial resolution HS image to be estimated is 
assumed to live in a low dimensional subspace. This hypothe¬ 
sis is very reliable when the observed scene is composed of a 
finite number of macroscopic materials (called endmembers). 
Based on the model ^ and on the low dimensional subspace 
assumptions, the distributions of Yh and Ym can be expressed 
as follows 

Yh|U - A^Ar^„^(HUBS, AH,Im), .g. 

Ym|U - AfA/',„,(RHU, AM,In) 

where A4JV represents the matrix normal distribution (57| , the 
target image is X = HU, with H G containing in 

its columns a basis of the signal subspace of size fhx mx 
and U G contains the representation coefficients of X 

with respect to H. The subspace transformation matrix H can 
be obtained via different approaches, e.g., PC A (5^ or vertex 
component analysis (59| . 

According to Bayes’ theorem and using the fact that the 
noises Nh and Nm are independent, the posterior distribution 


of U can be written as 


p (U| Yh, Ym) ^ p (Yh|U) p (Ym |U) p (U) (10) 


or equivalentl}0 


-logp(U|YH,YM) = JiIAhVYh -HUBS)||^ + 


HS data term 
=logp(YH|U) 


-||Am"(Ym-RHU)| 

'-V- 

MS data term 
=logp(YM|U) 


- X4>(u) 

regularizer 

=logp(U) 


( 11 ) 

where ||X||i? ^Tr(XX^) is the Frobenius norm of X. 
An important quantity in the negative log-posterior O is the 
penalization term 0(U) which allows the inverse problem <0 
to be regularized. The next sections discuss different ways of 
defining this penalization term. 

1) Naive Gaussian prior: Denote as (i = 1, • • • , n) the 
columns of the matrix U that are assumed to be mutually 
independent and are assigned the following Gaussian prior 
distributions 


( 12 ) 

where is a fixed image defined by the interpolated HS 
image projected into the subspace of interest, and is an un¬ 
known hyperparameter matrix. Different interpolations can be 
investigated to build the mean vector pi^. In this paper, we have 
followed the strategy proposed in (^ . To reduce the number 
of parameters to be estimated, the matrices are assumed to 
be identical, i.e., Xi = • • • = = X. The hyperparameter 

X is assigned a proper prior and is estimated jointly with the 
other parameters of interest. To infer the parameter of interest, 
namely the projected highly resolved HS image U, from 
the posterior distribution p(U|Yh,Ym), several algorithms 
have been proposed. In (3^ , a Markov chain Monte 

Carlo (MCMC) method is exploited to generate a collection 
of Nmc samples that are asymptotically distributed according 
to the target posterior. The corresponding Bayesian estimators 
can then be approximated using these generated samples. For 
instance, the minimum mean square error (MMSE) estimator 
of U can be approximated by an empirical average of the 
generated samples Ummse « n„c-Nk where 

Abi is the number of burn-in iterations required to reach the 
sampler convergence, and is the image generated in 

the tth iteration. The highly-resolved HS image can finally 
be computed as Xmmse = HUmmse- An extension of the 
proposed algorithm has been proposed in (5^ to handle the 
specific scenario of an unknown sensor spectral response. In 
a deterministic counterpart of this MCMC algorithm has 
been developed, where the Gibbs sampling strategy of (3^ 
has been replaced with a block coordinate descent method to 
compute the maximum a posteriori (MAP) estimator. Finally, 
very recently, a Sylvester equation-based explicit solution of 

^We use the symbol = to denote equality apart from an additive constant. 
The additive constants are irrelevant, since the functions under consideration 
are to be optimized, and the additive constants don’t change the locations of 
the optima. 
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the related optimization problem has been derived in 1^ , 
leading to a significant decrease of the computational com¬ 
plexity. 

2) Sparsity promoted Gaussian prior: Instead of incorpo¬ 

rating a simple Gaussian prior or smooth regularization for 
the HS and MS fusion | [50| , | [5T| , a sparse represen¬ 
tation can be used to regularize the fusion problem. More 
specifically, image patches of the target image (projected 
onto the subspace defined by H) are represented as a sparse 
linear combination of elements from an appropriately chosen 
over-complete dictionary with columns referred to as atoms. 
Learning the dictionary from the observed images instead of 
using predefined bases generally improves image 

representation | [65| , which is preferred in most scenarios. 
Therefore, an adaptive sparse image-dependent regularization 
can be explored to solve the fusion problem ([8]). In p6) , the 
following regularization term was introduced: 

-j 

(/.(U) (X - logp (U) = - ^ ||Ufe - V (D;iAfe) 11^ , (13) 

^ k=l 

where 

• U/c G is the kih band (or row) of U G 

k = l,.. .,fhx, 

• V{') : 1 -^ is a linear operator that averages 

the overlapping patche^ of each band, npat being the 
number of patches associated with the ith band, 

is the overcomplete dictionary, whose 
columns are basis elements of size (corresponding to 
the size of a patch), nat being the number of dictionary 
atoms, and 

m Ai e M^atxnpat jg Qf band. 

Inspired by hierarchical models frequently encountered in 
Bayesian inference | [67| , a second level of hierarchy can 
be considered in the Bayesian paradigm by including the 
code A within the estimation, while fixing the support Cl = 
{Ai,..., Cli^^ } of the code A. Once D, Cl and H have been 
learned from the HS and MS data, maximizing the posterior 
distribution of U and A reduces to a standard constrained 
quadratic optimization problem with respect to (w.r.t.) U and 
A. The resulting optimization problem is difficult to solve 
due to its large dimension and due to the fact that the linear 
operators H(-)BD and V{-) cannot be easily diagonalized. 
To cope with this difficulty, an optimization technique that 
alternates minimization w.r.t. U and A has been introduced 
in (where details on the learning of t). Cl and H can be 
found). In | [6T| , the authors show that the minimization w.r.t. 
U can be achieved analytically, which greatly accelerates the 
fusion process. 

3) HySure: The works p5| , | [54| introduce a convex reg¬ 
ularization problem which can be seen under a Bayesian 
framework. The proposed method uses a form of vector total 
variation (VTV) for the regularizer 0(U), taking into 
account both the spatial and the spectral characteristics of 
the data. In addition, another convex problem is formulated 
to estimate the relative spatial and spectral responses of the 

decomposition into overlapping patches was adopted, to prevent the 
occurrence of blocking artifacts (^. 


sensors B and R from the data themselves. Therefore, the 
complete methodology can be classified as a blind superreso¬ 
lution method, which, in contrast to the classical blind linear 
inverse problems, is tackled by solving two convex problems. 

The VTV regularizer (see | [68| ) is given by 

n rhx 
j = l \ k = l 

where A^ denotes the element in the kth row and jth column 
of matrix A, and the products by matrices Y>h and 
compute the horizontal and vertical discrete differences of an 
image, respectively, with periodic boundary conditions. 

The HS pansharpened image is the solution to the following 
optimization problem 


minimize 

u 


i||YH-HUBs||^ 



2 


Ym - RHU 


2 

F 


(15) 


where and control the relative weights of the different 
terms. The optimization problem ( p~5] ) is hard to solve, es¬ 
sentially for three reasons: the downsampling operator BS is 
not diagonalizable, the regularizer 0(U) is nonquadratic and 
nonsmooth, and the target image has a very large size. These 
difficulties were tackled by solving the problem via the split 
augmented lagrangian shrinkage algorithm (SALSA) | [69| , an 
instance of ADMM. As an alternative, the main step of the 
ADMM scheme can be conducted using an explicit solution 
of the corresponding minimization problem, following the 
strategy in (M). 

The relative spatial and spectral responses B and R were 
estimated by solving the following optimization problem: 

minimize ||RYh — Y]V[BS|| + A^0^(B) + Ai^0i^(R) 

B,R 

(16) 

where 0 b(-) and 0r(') are quadratic regularizers and A^, A^^ > 
0 are their respective regularization parameters. 


E. Matrix factorization 

The matrix factorization approach for HS-fMS fusion es¬ 
sentially exploits two facts: 1) A basis or dictionary H for 
the signal subspace can be learned from the HS observed 
image Yh, yielding the factorization X = HU; 2) using this 
decomposition in the second equation of (|^ and for negligible 
noise, i.e., Nm — 0, we have Yh = RHU. Assuming that 
the columns of RH are full rank or that the columns of U 
admit a sparse representation w.r.t. the columns of RH, then 
we can recover the true solution, denoted by U, and use it to 
compute the target image as X = HU. The works |[^, 

| [74| are representative of this line of attack. In what follow, 
we detail the application of the coupled nonnegative matrix 
factorization (CNMF) to the HS-fPAN fusion problem. 

The CNMF was proposed for the fusion of low spatial 
resolution HS and high spatial resolution MS data to produce 
fused data with high spatial and spectral resolutions p^ . It 
is applicable to HS pansharpening as a special case, in which 
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the higher spatial resolution image has a single band | [75| . 
CNMF alternately unmixes both sources of data to obtain the 
endmember spectra and the high spatial resolution abundance 
maps. 

To describe this method, it is convenient to first briefiy 
introduce linear mixture models for HS images. These models 
are commonly used for spectral unmixing, owing to their 
physical effectiveness and mathematical simplicity p7| . The 
spectrum at each pixel is assumed to be a linear combination 
of several endmember spectra. Therefore, X G 
formulated as 


In order to be able to use a reference image for quality 
assessment, one normally has to resort to the use of semi¬ 
synthetic HS and PAN images. In this case, a real-life HS 
image is used as reference. The HS and PAN images to be 
processed are obtained by degrading this reference image. A 
common methodology for obtaining the degraded images is 
Wald’s protocol, described in the next subsection. In order 
to evaluate the quality of the fused image with respect to 
the reference image, a number of statistical measures can be 
computed. The most widely used ones are described ahead. 


and used in the experiments reported in Section IV-B 


X = HU (17) 

where H G is the signature matrix, containing the 

spectral representations of the endmembers, and U G 
is the abundance matrix, containing the relative abundances 
of the different endmembers at the various pixels, with p 
representing the number of endmembers. By substituting ( p^ 
into Yh and Ym can be approximated as 


Yh « HUh 
Ym « HmU 


( 18 ) 


where Ur = UBS and Hm = RH. CNMF alternately 
unmixes Yr and Ym in the framework of nonnegative matrix 
factorization (NMF) | [76l to estimate H and U under the 
constraints of the relative sensor characteristics. CNMF starts 
with NMF unmixing of the low spatial resolution HS data. 
The matrix H can be initialized using, for example, the vertex 
component analysis (VCA) and H and Ur are then 

alternately optimized by minimizing ||Yr — HUr|||. using 
Lee and Seung’s multiplicative update rules | [76| . Next, U 
is estimated from the higher spatial resolution data. Hm is 
set to RH and U is initialized by the spatially up-sampled 
matrix of Ur obtained by using bilinear interpolation. For 
HS pansharpening (nA=l), only U is optimized by minimizing 
||Ym — HmU|||. with the multiplicative update rule, whereas 
both Hm and U are alternately optimized in the case of 
HS-fMS data fusion. Finally, the high spatial resolution HS 
data can be obtained by the multiplication of H and U. 
The abundance sum-to-one constraint is implemented using 
a method given in (TTj, where the data and signature matrices 
are augmented by a row of constants. The relative sensor 
characteristics, such as BS and R, can be estimated from 
the observed data sources fT^ . 


HI. Quality Assessment of fusion products 

Quality assessment of a pansharpened real-life HS image 
is not an easy task | [79| , ||^, since a reference image is 
generally not available. When such an image is not available, 
two kinds of comparisons can be performed: i) Each band 
of the fused image can be compared with the PAN image, 
with an appropriate criterion. The PAN image can also be 
compared with the PAN image reconstructed from the fused 
image, ii) The fused image can be spatially degraded to the 
resolution of the original HS image. The two images can then 
be compared, to assess to what extent the spectral information 
has been modified by the fusion method. 


A. Wald's Protocol 

A general paradigm for quality assessment of fused images 
that is usually accepted in the research community was first 
proposed by Wald et al. | [79| . This paradigm is based on 
two properties that the fused data have to have, as much as 
possible, namely consistency and synthesis properties. The 
first property requires the reversibility of the pansharpening 
process: it states that the original HS image should be obtained 
by properly degrading the pansharpened image. The second 
property requires that the pansharpened image be as similar 
as possible to the image of the same scene that would be 
obtained, by the same sensor, at the higher resolution. This 
condition entails that both the features of each single band 
and the mutual relations among bands have to be preserved. 
However, the definition of an assessment method that fulfills 
these constraints is still an open issue | [80| , and closely 
relates to the general discussion regarding image quality 
assessment | [8^ and image fusion [8^ , | [84| . 

Wald’s protocol for assessing the quality of pansharpen¬ 
ing methods | [79| , depicted in Fig. synthetically generates 
simulated observed images from a reference HS image, and 
then evaluates the pansharpening methods’ results against that 
reference image. The protocol consists of the following steps: 

• Given a HS image, X, to be used as reference, a sim¬ 
ulated observed low spatial resolution HS image, Yr, 
is obtained by applying a Gaussian blur to X, and then 
downsampling the result by selecting one out of every 
d pixels in both the horizontal and vertical directions, 
where d denotes the downsampling factor. 

• A simulated PAN image, P, is obtained by multiplying 
the reference HS image, on the left, by a suitably chosen 
spectral response vector, P = r^X. 

• The pansharpening method to be evaluated is applied 
to the simulated observations Yr and P, yielding the 
estimated superresolution HS image, X. 

• Finally, the estimated superresolution HS image and the 
reference one are compared, to obtain quantitative quality 
measures. 


B. Quality Measures 

Several quality measures have been defined in the literature, 
in order to determine the similarity between estimated and 
reference spectral images. These measures can be generally 
classified into three categories, depending on whether they 
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where, given the vectors a, b G , 

SAM(a, b) = arccos > (21) 

(a, b) = a^b is inner product between a and b, and || • || is the 
£2 norm. SAM is a measure of the spectral shape preservation. 
The optimal value of SAM is 0. The values of SAM reported 
in our experiments have been obtained by averaging the values 
obtained for all the image pixels. 

3) Root mean squared error: The RMSE measures the £2 
error between the two matrices X and X 

RMSE(X, X) = (22) 

^n*mx 

where ||X||i7 = ^trace(X^X) is the Frobenius norm of X. 
The ideal value of RMSE is 0. 

4) Erreur relative globale adimensionnelle de synthese: 
ERGAS offers a global indication of the quality of a fused 
image. It is defined as 

ERGAS(X, X) = 100 d, (23) 


Fig. 2. Flow diagram of the experimental methodology, derived from Wald’s 
protocol (simulated observations), for synthetic and semi-real datasets. 


attempt to measure the spatial, spectral or global quality of 
the estimated image. This review is limited to the most widely 
used quality measures, namely the cross correlation (CC), 
which is a spatial measure, the spectral angle mapper (SAM), 
which is a spectral measure, and the root mean squared 
error (RMSE) and erreur relative globale adimensionnelle de 
synthese (ERGAS) | [85| , which are global measures. Below 
we provide the formal defeitions of these measures operating 
on the estimated image X G and on the reference 

HS image X G definitions, xy and Xj denote 

the jth columns of X and X, respectively, the matrices 
A,B G denote two generic single-band images, and 

Ai denotes the ith element of A. 

1) Cross correlation: CC, which characterizes the geomet¬ 
ric distortion, is defined as 

-| rnx 

CC(X,X) =-Vccs(xyx©, (19) 

mx ^ 

1=1 

where CCS is the cross correlation for a single-band image, 
defined as 

- Hb) 

where, pA = (1/f^) Y^^=i the sample mean of A. The 

ideal value of CC is 1. 

2) Spectral angle mapper: SAM, which is a spectral mea¬ 
sure, is defined as 

1 ^ 

SAM(X,X) = - y]SAM(xj,Xj), (20) 

ri . 
j=i 



where d is the ratio between the linear resolutions of the PAN 
and HS images, defined as 


PAN linear spatial resolution 
HS linear spatial resolution 


RMSE/, = 


IIX^ -X^ 


-, /i/c is the sample mean of the kth 


band of X. The ideal value of ERGAS is 0. 


IV. Experimental Results 

A. Datasets 

The datasets that were used in the experimental tests were 
all semi-synthetic, generated according to the Wald’s protocol. 
In all cases, the spectral bands corresponding to the absorption 
band of water vapor, and the bands that were too noisy, were 
removed from the reference image before further processing. 
Three real-life HS images have been used as reference images 
for the Wald protocol. In the following, we describe the 
datasets that were generated from these images. Table 
summarizes their properties. These datasets are expressed in 
spectral luminance (nearest to the sensor output, without pre¬ 
processing) and are correctly registered. 

TABLE II 

Caracteristic of the three datasets 


dataset 

dimensions 

spatial res 

N 

instrument 

Moffett 

PAN 185 X 395 

HS 37 X 79 

20m 

100m 

224 

AVIRIS 

Camargue 

PAN 500 X 500 

HS 100 X 100 

4m 

20m 

125 

HyMap 

Carons 

PAN 400 X 400 

HS 80 X 80 

4m 

20m 

125 

HyMap 
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1) Moffett field dataset: This dataset represents a mixed 
urban/rural scene. The dimensions of the PAN are 185 x 395 
with a spatial resolution of 20m whereas the size of the HS 
image is 37 x 79 with a spatial resolution of 100m (which 
means a spatial resolution ratio of 5 between the two images). 
The HS image has been acquired by the airborne hyperspec- 
tral instrument airborne visible infrared image spectrometer 
(AVIRIS). This instrument is characterized by 224 bands 
covering the spectral range 0.4 — 2.5/im. 

2) Camargue dataset: This dataset represents a rural area 
with different kind of crops. The dimensions of the PAN 
image are 500 x 500 pixels with a spatial resolution of 4m 
whereas the size of the HS image is 100 x 100 pixels with a 
spatial resolution of 20m, (which means a spatial resolution 
ratio of 5 between the two images). The HS image has been 
acquired by the airborne hyperspectral instrument HyMap 
(Hyperspectral Mapper) in 2007. The hyperspectral instrument 
is characterized by 125 bands covering the spectral range 
0.4 — 2.5/im. 

3) Garons dataset: This dataset represents a rural area with 
a small village. The dimension of the PAN image are 400 x 400 
with a spatial resolution of 4m whereas the size of the HS 
image is 80 x 80 with a spatial resolution of 20m, (which 
means a spatial resolution ratio of 5 between the two images). 
This dataset has been acquired with the HyMap instrument in 
2009. 


B. Results and Discussion 


Methods presented in Section have been applied on 
the three datasets presented in Section |IV-A| and analyzed 
following the Wald’s Protocol (Section [TiPBl TTables[ml |I^[Vl 
report their quantitative evaluations with respect to the quality 
measures detailed in Section IIII-BI 


TABLE III 

Quality measures eor the Moeeett eield dataset 


method 

CC 

SAM 

RMSE 

ERGAS 

SFIM 

0.92955 

9.5271 

365.2577 

6.5429 

MTF-GLP 

0.93919 

9.4599 

352.1290 

6.0491 

MTF-GLP-HPM 

0.93817 

9.3567 

354.8167 

6.1992 

GS 

0.90521 

14.1636 

443.4351 

7.5952 

GSA 

0.93857 

11.2758 

363.7090 

6.2359 

PCA 

0.89580 

14.6132 

463.2204 

7.9283 

GFPCA 

0.91614 

11.3363 

404.2979 

7.0619 

CNMF 

0.95496 

9.4177 

314.4632 

5.4200 

Bayesian Naive 

0.97785 

7.1308 

220.0310 

3.7807 

Bayesian Sparse 

0.98168 

6.6392 

200.3365 

3.4281 

HySure 

0.97059 

7.6351 

254.2005 

4.3582 


Figures [3 00 represent the RMSEs per pixel between 
the image estimated by some methods and the reference 
image for the three considered datasets. Note that, for sake of 
conciseness, some methods have not been considered here but 
only their improved versions are presented. More specifically, 
GS has been removed since GSA is an improved version of 
GS. Indeed, GSA is expected to give better results than GS 


TABLE IV 

Quality measures eor the Camargue dataset 


method 

CC 

SAM 

RMSE 

ERGAS 

SFIM 

0.91886 

4.2895 

637.1451 

3.4159 

MTF-GLP 

0.92397 

4.3378 

622.4711 

3.2666 

MTF-GLP-HPM 

0.92599 

4.2821 

611.9161 

3.2497 

GS 

0.91262 

4.4982 

665.0173 

3.5490 

GSA 

0.92826 

4.1950 

587.1322 

3.1940 

PCA 

0.90350 

5.1637 

710.3275 

3.8680 

GFPCA 

0.89042 

4.8472 

745.6006 

4.0001 

CNMF 

0.9300 

4.4187 

591.3190 

3.1762 

Bayesian Naive 

0.95195 

3.6428 

489.5634 

2.6286 

Bayesian Sparse 

0.95882 

3.3345 

448.1721 

2.4712 

HySure 

0.9465 

3.8767 

511.8525 

2.8181 


TABLE V 

Quality measures eor the Garons dataset 


method 

CC 

SAM 

RMSE 

ERGAS 

SFIM 

0.77052 

6.7356 

1036.4695 

5.1702 

MTF-GLP 

0.80124 

6.6155 

956.3047 

4.8245 

MTF-GLP-HPM 

0.79989 

6.6905 

962.1076 

4.8280 

GS 

0.80347 

6.6627 

1037.6446 

5.1373 

GSA 

0.80717 

6.7719 

928.6229 

4.7076 

PCA 

0.81452 

6.6343 

1021.8547 

5.0166 

GFPCA 

0.63390 

7.4415 

1312.0373 

6.3416 

CNMF 

0.82993 

6.9522 

893.9194 

4.4927 

Bayesian Naive 

0.86857 

5.8749 

784.1298 

3.9147 

Bayesian Sparse 

0.87834 

5.6377 

750.3510 

3.7629 

HySure 

0.86080 

6.0224 

778.1051 

4.0454 


thanks to its adaptive estimation of the weight for generating 
the equivalent PAN image from the HS image, which allows 
the spectral distortion to be reduced. Bayesian naive approach 
has been also removed since the sparsity-based approach relies 
on a more complex prior and gives also better results. MTF- 
GLP and MTF-GLP-HPM yield similar results so only the 
latter has been considered. 

Figures and [ 7 ] show extracts of the final result obtained 
by the considered methods on the Camargue dataset in the 
visible (R= 704.39nm, G= 557.90nm, B= 454.5nm) and 
in the SWIR (R= 1216.7nm, G= 1703.2nm, B= 2159.8nm) 
domains, respectively. 

Figures and show pixel spectra recovered by the 
fusion methods, which correspond to 10th, 50th and 90th 
percentile of RMSE, respectively. Those spectra have been 
selected by choosing GSA as the reference for RMSE value. 
GSA have been chosen since it is a classical approach that 
has been widely used in literature and also gives good results. 
To ensure a reasonable number of figures, only visual results 
and some spectra of the Camargue dataset has been reported 
in this article. The results for the two other datasets can be 
found in the supporting document available onlin^ In 

“ http://openremotesensing.net 
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Fig. 3. RMSE between the methods’ result and the reference image, per 
pixel for the Moffett held dataset 



Fig. 4. RMSE between the methods’ result and the reference image, per 
pixel for the Camargue dataset 



Fig. 5. RMSE between the methods’ result and the reference image, per 
pixel for the Garons dataset 


particular, because of the nature of the Garons dataset (village 
with lot of small buildings) and the chosen ratio of 5, worse 
results have been obtained than for the two first datasets since 



(i) (j) 


Fig. 6. Details of original and fused Camargue dataset HS image in the 
visible domain, (a) reference image, (b) interpolated HS image, (c) SFIM, 
(d) MTF-GLP-HPM, (e) GSA, (f) PCA, (g) GFPCA, (h) CNMF, (i) Bayesian 
Sparse, (j) HySure 


a lot of mixing is presented in the HS image. 

A visual analysis of the result shows that most of the fusion 
approaches considered in this paper give good results, excepted 
two methods: PCA and GFPCA. PCA belongs to the class of 
CS methods which are known to be characterized by their high 
fidelity in rendering the spatial details but their generation of 
significant spectral distortion. This is clearly visible in Figure 
(f), where significant differences of color can be observed 
with respect to the reference image, in particular when exam¬ 
ining the different fields. GFPCA here also performs poorly. 
Compared with PCA, there is less spectral distortion but the 
included spatial information seems to be not sufficient, since 
the fused image is significantly blurred. Spatial information 
provided by PCA is better since the main information of HS 
image (where the spatial information is contained) is replaced 
by the high spatial information contained in the PAN image. 
When using GFPCA, the guided filter controls the amount of 
spatial information added to the data, so not all the spatial 
information may be added to avoid to modify the spectral 
information too much. For the Moffett field dataset, GFPCA 
performs a little bit better since, in this dataset, there is a lot 
of large areas. Thus blur is less present whereas, in the Garons 
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Fig. 7. Details of original and fused Camargue dataset HS image in the 
SWIR domain, (a) reference image, (b) interpolated HS image, (c) SFIM, (d) 
MTF-GLP-HPM, (e) GSA, (f) PCA, (g) GFPCA, (h) CNMF, (i) Bayesian 
Sparse, (j) HySure 



Fig. 8. Luminance of the pixel corresponding to the 10th percentile of the 
RMSE (Camargue dataset) 


dataset, GFPCA performs worse since this image consists of 
numerous small features, leading to more blurring effects. As a 
consequence, in this case, GFPCA performs worse than PCA. 
To analyze the spectrum in detail, chosen thanks to RMSE 



Fig. 9. Luminance of the pixel corresponding to the 50th percentile of the 
RMSE (Camargue dataset) 



wavelength 


Fig. 10. Luminance of the pixel corresponding to the 90th percentile of the 
RMSE (Camargue dataset) 


percentiles, some additional information about the correspond¬ 
ing pixels are needed. Fig. corresponds to a pixel in the 
reference image which represents a red building. Since in the 
HS image this building is mixed with its neighborhood, we do 
not have the same information between the reference image 
(“pure” spectrum) and the HS image (“mixed” spectrum). Fig. 

corresponds to a pixel in a vegetation area, no mixing is 
present and very good results have been obtained for all the 
methods. For Fig. [T^ the pixel belongs to a small building 
not visible in the HS image and spectral mixing is then 
also present. More generally, spectra in the HS and reference 
images differ since some mixing processes occur in the HS 
image. Thus, the HS pansharpening methods are expected to 
provide spectra that are more similar to the HS spectra (which 
contains the available spectral information) than the reference 
(which has information missing in the HS which should not 
be found in the result, unless successful unmixing has been 
conducted). However, it is important to note that for Figj^ 
Bayesian methods and HySure successfully recover spectra 
that are more similar to the reference spectrum than the HS 
spectrum. 
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Table |V^ report the computational times required by each 
HS pansharpening methods on the Camargue dataset those 
values have been obtained with a Intel Core i5 3230M 2.6 
GHz with 8 GB RAM. Based on this table, these methods can 
be classified as follows: 


• Methods which do not work well for HS pansharpening: 
PCA, GS, GFPCA 

• Methods which work well with a low time of computation 
(few seconds): GSA, MRA methods, Bayesian Naive 

• Methods which work well with an average time of 
computation (around one minute): CNMF 

• Methods which work well (slightly better) with an impor¬ 
tant time of computation (few minutes, depends greatly 
on the size of the dataset): Bayesian Sparse and HySure. 


TABLE VI 

Computational times of the different methods (in seconds) 


method 

Moffett 

Camargue 

Garons 

SFIM 

1.26 

3.47 

2.74 

MTF-GLP 

1.86 

4.26 

4.00 

MTF-GLP-HPM 

1.71 

4.25 

2.98 

GS 

4.77 

8.29 

5.56 

GSA 

5.52 

8.73 

5.99 

PCA 

3.46 

8.92 

6.09 

GFPCA 

2.58 

8.51 

4.36 

CNMF 

10.98 

47.54 

23.98 

Bayesian Naive 

1.31 

7.35 

3.07 

Bayesian Sparse 

133.61 

485.13 

259.44 

HySure 

140.05 

296.27 

177.60 


To summarize, the comparison of the different methods 
performances for RMSE curves and quality measures confirms 
than PCA and GFPCA does not provide good results for 
HS pansharpening (GFPCA is know to perform much better 
on HS-i-RGB data). The other methods perform well, with 
Bayesian approaches having slightly better result but with a 
higher computational cost. The favorable fusion performance 
obtained by the Bayesian methods can be explained, in part, 
by the fact that they rely on a forward modeling of the PAN 
and HS images and explicitly exploit the spatial and spectral 
degradations applied to the target image. However, these 
algorithms may suffer from performance discrepancies when 
the parameters of these degradations (i.e., spatial blurring 
kernel, sensor spectral response) are not perfectly known. In 
particular, when these parameters are fully unknown and need 
to be fixed, they can be estimated jointly with the fused image, 
as in | [5^ , or estimated from the MS and HS images in a pre¬ 
processing step, following the strategies in | [78| or p5| . CS 
methods are fast to compute and easy to implement. They 
provide good spatial results but poor spectral results with 
significant spectral distortions, in particular when considering 
PCA and GS. GSA provides better results than the two other 
methods thanks to its adaptive weight estimation reducing the 
spectral distortion of the equivalent PAN image created from 
the HS image. MRA methods are fast, MTF-based methods 
give better results than SFIM and perform as well as the most 


competitive algorithms with higher computational complexity. 
SFIM does not perform as well than the other MRA methods 
since it used a box filter which should give less good result. In 
our experimentations, results from SFIM are not so different 
from those obtained with the MTF-based methods. This may 
come from the fact that semi-synthetic datasets are used so 
MTF may not be fully used to its potential. Table |VII| report 
these pro and cons associated with each HS pansharpening 
method. 

Finally, note that, in our experimentations, no registration 
error and temporal misalignment have been considered, which 
suggests that the robustness of the different methods has 
not been fully analyzed. When such problems may occur, 
CS and MRA methods may perform better thanks to their 
great robustness. In particular, CS methods are robust against 
misregistration error and aliasing, whereas MRA approaches 
are robust against aliasing and temporal misalignment. It is 
also worthy to note that the quality of a fusion method should 
also be related to a specific application (such as classification 
or target detection). Indeed, a method providing images with 
good performance metrics may or may not be the best for this 
specific application. 


V. Conclusion 

In this paper a qualitative and quantitative comparison 
of 11 different HS pansharpening methods was conducted, 
considering classical MS pansharpening techniques adapted 
to the HS context, and methods originally designed for HS 
pansharpening. More precisely, five classes of methods were 
presented: CS, MRA, Hybrid, Bayesian and matrix factoriza¬ 
tion. Those methods were evaluated on three different datasets 
representative of various scenario: mixed urban/rural area, 
rural area and urban area. 

A careful analysis of their performances suggested a classifi¬ 
cation of these methods into four groups: i) Methods with poor 
fusion results (CS-based methods and GFPCA), ii) Methods 
with good fusion performances and low computational costs 
(MRA methods, GSA and Bayesian naive) that may be suit¬ 
able for fusing large scale images, which is often the case 
for spacebome hyperspectral imaging missions, iii) Methods 
with good fusion performances and reasonable computational 
costs (CNMF), iv) Methods with slightly better fusion results 
but with higher computational costs (HySure and Bayesian 
Sparse). Those results were obtained with semi-synthetic 
datasets with no registration error or temporal misalignment. 
Thus robustness of the methods against these issues were 
not taken into account. When such problems may happen, 
different results could be obtained and classical pansharpening 
methods (CS and MRA) may give better results thanks to their 
robustness to these specific issues. 

The experiments and the quality measures presented in this 
paper were performed using MATLAB implementations of the 
algorithms. A MATLAB toolbox is made available onlin^to 
the community to help improving and developing new HS 
pansharpening methods and to facilitate comparison of the 
different methods. 

^ http://openremotesensing.net 
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TABLE VII 

Pros and Cons of each methods 


method 

pros 

cons 

SEIM 

II.B.l 

1) Low computational 
complexity 

1) Reduced performance 
when compared to MTE 
methods (since it uses a 
box filter) 

MTF-GLP 

ILB.2 

1) Performs well 

2) Low computational 
complexity 


MTF-GLP-HPM 

ILB.2 

1) Performs well 

2) Low computational 
complexity 


GS 

ILA.2 

1) Spatial information 
is well preserved 

2) Low computational 
complexity 

3) Easy implementation 

1) Low performance 
for HS images 

2) Significant spectral 
distortion 

GSA 

ILA.2 

1) Spatial information 
is well preserved 

2) Spectral distortion 
is reduced (compared 
to GS) 

3) Low computational 
complexity 

4) Easy implementation 


PCA 

11. A. I 

1) Spatial information 
is well preserved 

2) Low computational 
complexity 

3) Easy implementation 

1) Low performance 
for HS images 

2) Significant spectral 
distortion 

GFPCA 

ILC.l 

1) Spectral information 
is well preserved 

2) Low computational 
complexity 

1) Low performance 
for HS images (work 
better with RGB images) 

2) Not enough spatial 
information added (lot 
of blur) 

CNMF 

ILE.l 

1) Good results 
(spatial and spectral) 

1) Sensor characteristics 
required 

2) Medium computational 

cost 

Bayesian Naive 
II.D.l 

1) Good results 
(spatial and spectral) 

2) Low computational 
complexity 

1) Sensor characteristics 
required 

Bayesian Sparse 
ILD.2 

1) Good results 
(spatial and spectral) 

1) high computational 

cost 

2) Sensor characteristics 
required 

HySure 

II.D.3 

1) Good results 
(spatial and spectral) 

1) high computational 

cost 
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